Methods for Handling Multiple Outcomes in Health Data

Sivakamakshi Muthu Kumarasamy1, Shivadarshini Sreekanth1

1 School of Mathematical and Statistical Sciences, University of Galway

Background and Problem

Health research often involves multiple outcomes (e.g., survival, disease progression, quality-of-life scores). Traditional approaches analyse outcomes separately, implicitly treating outcomes as independent and potentially missing biological and temporal links. Alternatively, composite endpoints can be used but may hide heterogeneity in clinical importance, frequency, and treatment response. As a result, both predictive performance and clinical interpretability may be compromised.

Emerging Methods

  • Clinical trials: The win ratio preserves the hierarchy of endpoint importance while combining the endpoints into a single analysis of the treatment effect.
  • Prediction: Multi-state models describe a process in which subjects transition from one state to the next. Multi-task models are deep learning models that use shared pathways to predict multiple outcomes of interest.
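
The hierarchical comparison behind the win ratio can be sketched as follows. This is a minimal unmatched-pairs illustration, assuming overall survival is prioritised over progression and that a pair left undecided on the first endpoint (because of censoring or tied times) is passed to the second. The function names and the toy patients are hypothetical and are not taken from the SECOMBIT analysis.

```python
from itertools import product


def compare_event(time_t, event_t, time_c, event_c):
    """Compare one time-to-event endpoint for a treated/control pair.

    Returns +1 if the treated patient wins, -1 if the control patient wins,
    and 0 if the pair is undecided (censoring or tied times).
    """
    # Control had the event while the treated patient was still event-free.
    if event_c and time_c < time_t:
        return 1
    # Treated had the event while the control patient was still event-free.
    if event_t and time_t < time_c:
        return -1
    return 0


def win_ratio(treated, control):
    """Unmatched win ratio over all treated/control pairs.

    Each patient is a tuple (os_time, os_event, pfs_time, pfs_event),
    with event = 1 for an observed event and 0 for censoring.
    """
    wins = losses = 0
    for t, c in product(treated, control):
        # First priority: overall survival.
        result = compare_event(t[0], t[1], c[0], c[1])
        if result == 0:
            # Second priority: progression-free survival.
            result = compare_event(t[2], t[3], c[2], c[3])
        if result > 0:
            wins += 1
        elif result < 0:
            losses += 1
    ratio = wins / losses if losses else float("inf")
    return wins, losses, ratio


# Hypothetical toy data (times in months), for illustration only.
treated = [(24, 0, 18, 1), (15, 1, 12, 1), (28, 0, 28, 0)]
control = [(20, 1, 10, 1), (24, 0, 6, 1)]

wins, losses, ratio = win_ratio(treated, control)
print(f"wins={wins}, losses={losses}, win ratio={ratio:.2f}")
```

A full analysis would also need a confidence interval for the ratio and a rule for pairs that remain tied on every endpoint; both are omitted here.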

Objectives of Project

Compare strategies for both clinical trial design and prediction modelling that treat outcomes independently, as composite endpoints, and via hierarchical/multi-state frameworks (with potential extension to multi-task modelling).

Data Source

Dataset Overview

  • SECOMBIT Trial: Phase II randomised; N ≈ 209 patients with metastatic melanoma; 3 arms.
  • Endpoints: Overall Survival (OS) and Progression Free Survival (PFS).
  • Covariates: Sites, Lactate Dehydrogenase (LDH), Tumour Mutational Burden (TMB), a genetic biomarker (JAK).

This dataset is used for both the clinical trial analysis and predictive modelling. The three treatment arms correspond to different combinations of first-line treatment strategies. The primary outcome was Overall Survival, and the secondary outcomes included Progression-Free Survival.

Preliminary Analysis

Early Results and Descriptive Statistics

Preliminary time-to-event analyses were conducted for Overall Survival (OS) and Progression-Free Survival (PFS) separately. Survival distributions were visualised using Kaplan–Meier plots. There was no evidence of a difference in OS between the three treatment arms (log-rank test: chi-squared = 2.80, df = 2, p-value = 0.246). Cox proportional hazards models were fitted to estimate covariate-adjusted effects and to serve as a preliminary predictive modelling strategy.

Results are interpreted in both clinical and methodological contexts, highlighting the limitations of analysing multiple outcomes independently and motivating integrated modelling approaches.

Baseline Summary Table (by Arm)

                        A (N=69)       B (N=69)       C (N=68)
Sites
  1 - 2                 43 (62.3%)     40 (58.0%)     42 (61.8%)
  >= 3                  26 (37.7%)     29 (42.0%)     26 (38.2%)
LDH
  Normal                41 (59.4%)     39 (56.5%)     47 (69.1%)
  Elevated              26 (37.7%)     28 (40.6%)     20 (29.4%)
  Missing               2 (2.9%)       2 (2.9%)       1 (1.5%)
TMB
  < 10                  20 (29.0%)     17 (24.6%)     18 (26.5%)
  >= 10                 8 (11.6%)      8 (11.6%)      12 (17.6%)
  Missing               41 (59.4%)     44 (63.8%)     38 (55.9%)
JAK
  Wild Type (Normal)    24 (34.8%)     17 (24.6%)     16 (23.5%)
  Mutated               5 (7.2%)       7 (10.1%)      14 (20.6%)
  Missing               40 (58.0%)     45 (65.2%)     38 (55.9%)

Kaplan - Meier Plot
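
The product-limit estimate that underlies a Kaplan–Meier plot can be sketched in a few lines. This is a minimal illustration on hypothetical toy data, not the SECOMBIT data. The function name `kaplan_meier` is an assumption made for this example, and a real analysis would normally rely on an established survival library.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times  : follow-up time for each patient
    events : 1 if the event was observed at that time, 0 if censored
    """
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        # Patients still under observation just before time t.
        at_risk = sum(1 for x in times if x >= t)
        # Events observed exactly at time t (censored patients are excluded).
        n_events = sum(1 for x, e in zip(times, events) if x == t and e)
        if n_events:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
    return curve


# Hypothetical toy data: five patients, two of them censored.
times = [2, 3, 3, 5, 8]
events = [1, 1, 0, 1, 0]

for t, s in kaplan_meier(times, events):
    print(f"t={t}: S(t)={s:.2f}")
```

The curve only drops at observed event times; censored patients leave the risk set without changing the estimate at their censoring time.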

Cox model — Overall Survival (OS)
Variable          HR (95% CI)          p-value
Arm B             0.98 (0.38–2.50)     0.9590
Arm C             0.83 (0.33–2.07)     0.6900
Sites >= 3        2.02 (0.93–4.42)     0.0764
LDH Elevated      0.96 (0.43–2.15)     0.9130
TMB >= 10         0.72 (0.31–1.67)     0.4480
JAK Mutated       0.63 (0.24–1.63)     0.3380

Cox model — Progression-Free Survival (PFS)
Variable          HR (95% CI)          p-value
Arm B             0.73 (0.30–1.77)     0.4870
Arm C             1.03 (0.48–2.22)     0.9470
Sites >= 3        1.60 (0.80–3.20)     0.1870
LDH Elevated      1.14 (0.56–2.34)     0.7150
TMB >= 10         0.85 (0.40–1.78)     0.6660
JAK Mutated       0.41 (0.16–1.03)     0.0571
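
The relationship between a hazard ratio, its confidence interval, and its p-value can be checked with a short calculation. The sketch below assumes the intervals are Wald intervals on the log-hazard scale and that the p-values come from a two-sided normal test, which the tables do not state. Because the inputs are rounded, the recovered p-values only approximate the reported ones. The function name `wald_from_hr` is hypothetical.

```python
import math


def wald_from_hr(hr, lower, upper, z_crit=1.959964):
    """Recover the standard error, z statistic, and two-sided p-value
    from a hazard ratio and its 95% confidence limits.

    Assumes the interval was built as exp(log(HR) +/- z_crit * SE).
    """
    log_hr = math.log(hr)
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = log_hr / se
    # Two-sided normal p-value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p


# Rows taken from the Cox tables: (label, HR, lower, upper, reported p).
rows = [
    ("OS,  Sites >= 3", 2.02, 0.93, 4.42, 0.0764),
    ("PFS, JAK Mutated", 0.41, 0.16, 1.03, 0.0571),
]

for label, hr, lo, hi, reported in rows:
    se, z, p = wald_from_hr(hr, lo, hi)
    print(f"{label}: SE={se:.3f}, z={z:.2f}, p={p:.3f} (reported {reported})")
```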

Log-rank tests

Endpoint    Chi-squared    df    p-value
OS          2.80           2     0.2460
PFS         7.74           2     0.0208
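
The p-values in this table can be reproduced from the test statistics. With 2 degrees of freedom the chi-squared upper-tail probability has the closed form exp(-x/2), so no statistical library is needed. The sketch below uses the rounded statistics from the table, so the agreement is approximate. The function name `chi2_sf_df2` is hypothetical.

```python
import math


def chi2_sf_df2(x):
    """Upper-tail probability of a chi-squared variable with 2 degrees of freedom."""
    return math.exp(-x / 2)


# Test statistics and reported p-values from the log-rank table.
log_rank = {"OS": (2.80, 0.2460), "PFS": (7.74, 0.0208)}

for endpoint, (chi_sq, reported) in log_rank.items():
    p = chi2_sf_df2(chi_sq)
    print(f"{endpoint}: chi-squared={chi_sq}, p={p:.4f} (reported {reported})")
```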

Proportional Hazards (PH) Test

Endpoint    PH test p-value
OS          0.0581
PFS         0.0119

Future Work

  • Compare independent outcome assessment with a composite outcome approach
  • Assess the potential of hierarchical endpoint analysis such as the win ratio
  • Explore multi-state methods and multi-task learning approaches for multi-endpoint risk prediction
  • Apply the methods to a more complex dataset

GitHub

The code and datasets for this project can be viewed at our GitHub repository here: https://github.com/darshu-d/MSc-Research-project-

References

Ascierto PA et al., SECOMBIT 4-year results. Nat Commun (2024).

Ascierto PA et al., SECOMBIT sequencing trial. J Clin Oncol (2023).

Taylor BS et al., Integrative genomic profiling of prostate cancer. Cancer Cell (2010).